Teaching Stereo Perception to YOUR Robot

Authors

  • Marcus Wallenberg
  • Per-Erik Forssén
Abstract

This paper describes a method for generating dense stereo ground-truth using a consumer depth sensor such as the Microsoft Kinect. Such ground-truth allows adaptation of stereo algorithms to a specific setting. The method uses a novel residual weighting based on error propagation from image plane measurements to 3D. We use this ground-truth in wide-angle stereo learning by automatically tuning a novel extension of the best-first propagation (BFP) dense correspondence algorithm. We extend BFP by adding a coarse-to-fine scheme and a structure measure that limits propagation along linear structures and flat areas. The tuned correspondence algorithm is evaluated in terms of accuracy, robustness, and ability to generalise. Both the tuning cost function and the evaluation are designed to balance the accuracy-robustness trade-off inherent in patch-based methods such as BFP.

Wide-angle stereo provides an overview of a scene, even at very short range. The large field of view (FoV) also ensures that the visual fields from different points of view have a high degree of overlap. For these reasons, wide-angle lenses are popular in navigation, mapping and visual object search on robot platforms. However, the radial distortion caused by these lenses complicates the application of traditional stereo algorithms. A common approach to wide-angle stereo is to first attempt to remove radial distortion and then apply a descriptor-based wide-baseline stereo algorithm. An alternative approach is to use simpler matching metrics and instead leverage correspondence propagation. One such algorithm is the best-first propagation (BFP) algorithm [2], and a recent addition is the generalised PatchMatch algorithm (GPM) [1]. Though these algorithms are more general than stereo algorithms, they have previously been applied to the stereo problem.
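To make the idea of correspondence propagation concrete, the following is a minimal sketch of a generic best-first propagation loop. It is not the authors' CtF-BFP: it has no coarse-to-fine scheme, no structure measure and no sub-pixel refinement. The matching score (zero-mean normalised cross-correlation), the patch radius `r`, the threshold `min_score` and all function names are assumptions chosen for illustration.

```python
import heapq
import numpy as np

def zncc(a, b):
    """Zero-mean normalised cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 1e-9 else -1.0

def patch(img, y, x, r):
    """Return the (2r+1)x(2r+1) patch centred on (y, x), or None near the border."""
    h, w = img.shape
    if y - r < 0 or x - r < 0 or y + r >= h or x + r >= w:
        return None
    return img[y - r:y + r + 1, x - r:x + r + 1]

def best_first_propagation(left, right, seeds, r=2, min_score=0.8):
    """Grow a correspondence map from seed matches, best score first.

    seeds: iterable of ((yl, xl), (yr, xr)) integer pixel pairs.
    Returns a dict mapping left pixels to right pixels.
    """
    heap = []
    for pl, pr in seeds:
        a, b = patch(left, *pl, r), patch(right, *pr, r)
        if a is not None and b is not None:
            # heapq is a min-heap, so scores are negated to pop the best first.
            heapq.heappush(heap, (-zncc(a, b), pl, pr))

    matches, used_right = {}, set()
    while heap:
        _, pl, pr = heapq.heappop(heap)
        if pl in matches or pr in used_right:
            continue  # keep the mapping one-to-one
        matches[pl] = pr
        used_right.add(pr)
        # Extend the match to the 4-neighbourhood of the left pixel. The right
        # pixel may deviate by at most one pixel from a pure translation.
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nl = (pl[0] + dy, pl[1] + dx)
            a = patch(left, *nl, r)
            if nl in matches or a is None:
                continue
            best = None
            for ey in (-1, 0, 1):
                for ex in (-1, 0, 1):
                    nr = (pr[0] + dy + ey, pr[1] + dx + ex)
                    b = patch(right, *nr, r)
                    if nr in used_right or b is None:
                        continue
                    s = zncc(a, b)
                    if s >= min_score and (best is None or s > best[0]):
                        best = (s, nr)
            if best is not None:
                heapq.heappush(heap, (-best[0], nl, best[1]))
    return matches
```

Because candidates are searched only in a small window around the parent match, no rectification or epipolar constraint is assumed, which is one reason propagation-based methods are attractive for images with radial distortion.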
Since we make use of both the inverse depth and pixel coordinates in our calibration procedure, the effect of errors in these measurements must be taken into account during calibration. We therefore propagate error variances from these measurements into both 3D reconstructions and resulting 2D projections. We then fuse multiple Kinect range scans in a reference view coincident with the left stereo camera. This is done by estimating a disparity distribution for each pixel in this image, and using mean-shift to find the visible surface closest to the camera.

We have imaged three indoor scenes, and for each of them calculated 51 full-resolution wide-angle disparity maps. Examples of individual range scans, an intermediate point cloud and the final disparity maps are shown in figure 2 (top row). We use these to tune our extension of the BFP algorithm, which we call coarse-to-fine best-first propagation (CtF-BFP). Novelties are the use of multiple scales, a structure threshold that limits propagation along linear structures, and a sub-pixel refinement step. These novelties add a multitude of parameters, which makes manual tuning difficult. We therefore use an automatic tuning procedure that minimises an objective function balancing accuracy, coverage and robustness.

When measuring performance of the stereo algorithm, we denote the estimated disparity map to be evaluated by D(u,v), defined on the domain V (pixels where disparities have been estimated). Similarly, the ground-truth disparity image is D∗(u,v), and the set of valid ground-truth pixels is V∗. To find a useful set of parameters, we minimise an objective function
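The excerpt breaks off before the objective function is stated, so it is not reproduced here. As a hedged illustration of the notation only, the sketch below computes three simple quantities from D, V, D∗ and V∗ that relate to coverage, accuracy and robustness. The function name, the particular measures and the threshold `outlier_thresh` are assumptions for illustration, not the measures used in the paper.

```python
import numpy as np

def disparity_metrics(D, V, D_star, V_star, outlier_thresh=1.0):
    """Compare an estimated disparity map with ground truth.

    D, D_star: float arrays holding estimated and ground-truth disparities.
    V, V_star: boolean masks of the pixels where each map is defined.
    outlier_thresh: hypothetical error (in pixels) above which an estimate
        is counted as an outlier.
    """
    both = V & V_star                     # pixels where a comparison is possible
    n_gt = int(V_star.sum())
    n_both = int(both.sum())
    coverage = n_both / n_gt if n_gt else 0.0
    if n_both == 0:
        return {"coverage": coverage,
                "mean_abs_error": float("nan"),
                "outlier_fraction": float("nan")}
    err = np.abs(D[both] - D_star[both])  # per-pixel absolute disparity error
    return {"coverage": coverage,
            "mean_abs_error": float(err.mean()),
            "outlier_fraction": float((err > outlier_thresh).mean())}
```

A tuning procedure could combine such terms into a single scalar cost, but how the paper weights them is not recoverable from this excerpt.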


Related resources

Motion Capture from Demonstrator's Viewpoint and Its Application to Robot Teaching

In this paper, we propose a kind of “teaching by demonstration” method, aiming at its application to humanoid robots at home in the future. The demonstrator’s motion is captured by a pair of stereo cameras mounted on his/her head, located very close to his/her eyes. By tracking the landmarks attached to the demonstrator’s hand and the working environment, one can estimate not only the demonstrato...

Stereoscopic Camera for Autonomous Mini-Robots Applied in KheperaSot League

This paper presents a stereoscopic vision system for the mini-robot Khepera. The vision system performs object detection by using the stereo disparity and stereo correspondence. The stereoscopic vision system enhances the robot’s visual perception ability by grabbing stereo images and analysing 3D objects, while the robot doesn’t need to move. The simple principle of our stereo vision is the less d...

Enabling Depth-Driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

Reliable depth perception eases and enables a large variety of attentional and interactive behaviors on humanoid robots. However, the use of depth in real-world scenarios is hindered by the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras. On the iCub humanoid robot, we recently adopted the Efficient Large-scale Stereo (ELAS) Matching algorithm (G...

Design and evaluation of a head-mounted display for immersive 3D teleoperation of field robots

This paper describes and evaluates the use of a head-mounted display (HMD) for the teleoperation of a field robot. The HMD presents a pair of video streams to the operator (one to each eye) originating from a pair of stereo cameras located on the front of the robot, thus providing him/her with a sense of depth (stereopsis). A tracker on the HMD captures 3-DOF head orientation data which is then...

Publication date: 2012